NEW YORK UNIVERSITY
COURANT INSTITUTE - LIBRARY
251 Mercer St. New York, N.Y. 10012


NEW   YORK   UNIVERSITY 
COURANT   INSTITUTE   OF 
MATHEMATICAL   SCIENCES 


IMM-NYU  334 
FEBRUARY  1965 


OPTIMAL  ROBUSTNESS  FOR 
ESTIMATORS  AND  TESTS 

ALLAN  BIRNBAUM  AND  EUGENE  LASKA 


PREPARED  UNDER 
CONTRACT  NONR  285(38) 
NR  042-206/11/18/64 




IMM-NYU 334
February 1965


New  York  University 
Courant  Institute  of  Mathematical  Sciences 


OPTIMAL  ROBUSTNESS  FOR  ESTIMATORS  AND  TESTS 

Allan  Birnbaum 
Eugene  Laska 


This report represents results obtained at the Courant
Institute of Mathematical Sciences under the sponsorship
of the Office of Naval Research under Contract No. Nonr 285(38).
Reproduction in whole or in part is permitted for any purpose of
the United States Government.

* 

Research  Facility,  Rockland  State  Hospital,  Orangeburg,  N.  Y. 




1. Introduction and summary. Tukey (1960, 1962) has provided
a broad perspective for research in efficiency-robustness of
estimators, as well as an important part of the knowledge
available in this area. The present paper is intended to complement
these by supplying formulations of concepts, techniques, and
initial results for optimally efficiency-robust estimators and tests
in several types of problems. Relations to Tukey's investigation
are discussed in Section 2, with brief reference to the related
work of Huber (1964). Relations to the approach to robust
estimation of Hodges and Lehmann (1963) are discussed in Section 3.
The present approach may be described as a formal indexing
of alternative specifications (e.g. "shapes" of error-distributions)
by a nuisance parameter, and adaptation of admissibility
and related concepts and Bayes techniques of the Neyman-Pearson
and Wald theories to the estimation and testing problems thus
formulated. Specific problems for which new optimal
efficiency-robust estimators are given are: linear estimation of location
parameters (Section 2); rank tests and related estimators for
two-sample problems (Section 3); and unbiased estimation
(Section 4). A by-product included in Section 4 is a generalization
of Stein's (1950) characterization of locally-best unbiased
estimators to the class of admissible unbiased estimators, together
with the corresponding complete class theorem.


2. Linear unbiased estimation of location parameters. Let $X$
be a random variable with p.d.f. $f((x-\mu)/\sigma, \lambda)$, where the finite
variance $\sigma^2$, mean $\mu$, and shape parameter $\lambda$ are unknown but
have the specified ranges $-\infty < \mu < \infty$, $\sigma > 0$, $\lambda \in \Lambda$. For each
$\lambda$ we assume that the density function is symmetric. Let
$(X_1, \ldots, X_n)$ denote $n$ independent observations on $X$, and let
$Y = (Y_1, \ldots, Y_n)$ denote the same observations ordered
nondecreasingly. We consider the problem of estimation of $\mu$, restricting
consideration to linear unbiased estimators (LUEs), that is,
estimators of the form $\mu^* = \sum_{i=1}^{n} a_i y_i$ for which
$E(\mu^*(Y) \mid \mu, \sigma, \lambda) = \mu$
identically in $\mu$, $\sigma$, and $\lambda \in \Lambda$. Estimators will be appraised
in terms of their variance functions $\mathrm{var}(\mu^* \mid \mu, \sigma, \lambda)$. When $\Lambda$
consists of a single point (i.e. the shape is known), the problem is
reduced to one solved by Lloyd (1952), who derived best linear
unbiased estimators (BLUEs) (of $\sigma$ as well as $\mu$, without the
assumption of symmetry made here). With $\lambda$ unknown, the problem
leads to considerations which are conveniently illustrated first
in the artificially simple case that $\Lambda$ contains just two
elements, and $\sigma$ is known. For example the two shapes might be
normal and double-exponential,

$$f((x-\mu), 1) = \frac{1}{\sqrt{2\pi}} \exp\{-\tfrac{1}{2}(x-\mu)^2\}, \qquad f((x-\mu), 2) = \tfrac{1}{2} \exp\{-|x-\mu|\},$$

or normal and 5% contaminated normal as defined in Tukey's work.
2.1 Illustrative discussion. Among the estimators of possible
interest in any such simplified problem, let us consider initially
the BLUE estimators, to be denoted by $\hat\mu_\lambda$, $\lambda = 1$ or $2$, and their
variances $\mathrm{var}(\hat\mu_\lambda \mid \lambda')$, $\lambda, \lambda' = 1, 2$, which turn out to be
independent of $\mu$ and $\sigma$ (as seen in Lloyd's results or their
extension below). Since these are BLUE estimators under respective
shapes, we have $\mathrm{var}(\hat\mu_1 \mid 1) \le \mathrm{var}(\hat\mu_2 \mid 1)$ and
$\mathrm{var}(\hat\mu_2 \mid 2) \le \mathrm{var}(\hat\mu_1 \mid 2)$.
If the first relation were an equality, $\hat\mu_2$ would be a uniformly
best linear unbiased estimator (UBLUE) over $\Lambda$, and would
provide a simple and ideal solution to our problem. In cases of
typical interest all such inequalities are strict, and more
complicated considerations must be faced. Several such cases are
illustrated in Figures 1-3, in which each estimator $\mu^*$ is
represented by its variance function $\mathrm{var}(\mu^* \mid \lambda)$, $\lambda = 1, 2$, plotted as
the point with the latter as respective coordinates. (These
considerations are in part analogous to familiar decision-theoretic
discussions of the convex set of "$\alpha, \beta$" points in the problem of
testing between two simple hypotheses.) In the hypothetical
example of Figure 1, BLUE $\hat\mu_1$ has variance function (1,4), and
BLUE $\hat\mu_2$ has variance function (6,2). Thus use of $\hat\mu_1$ risks a
two-fold increase of variance in case $\lambda = 2$ is the true shape,
and use of $\hat\mu_2$ risks a six-fold increase of variance in case
$\lambda = 1$ is true. Since each of these estimators is uniquely best
under one shape, each alternative estimator is represented by a
point of the form $(1+a, 2+b)$ with both $a$ and $b$ positive; the
general goal is to seek estimators for which both $a$ and $b$ are
small.

Definitions (not restricted to the case of two-point $\Lambda$). Among LUEs,
$\mu^*$ is naturally called better than $\mu^{**}$ if
$\mathrm{var}(\mu^* \mid \lambda) \le \mathrm{var}(\mu^{**} \mid \lambda)$
over $\Lambda$ with strict inequality for at least one $\lambda$. $\mu^*$ is
called an admissible linear unbiased estimator (ALUE, with respect
to $\Lambda$) if no other LUE is better.
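
The two definitions above can be illustrated on a finite list of risk points. The following is a minimal sketch, not part of the original report: the candidate points are hypothetical, and admissibility is checked only relative to the listed candidates rather than to the full class of LUEs.

```python
from typing import List, Sequence

RiskPoint = Sequence[float]  # one variance per shape in a finite set of shapes


def is_better(p: RiskPoint, q: RiskPoint) -> bool:
    """True if p is 'better' than q: no larger at any shape, smaller at one."""
    no_larger = all(a <= b for a, b in zip(p, q))
    strictly_smaller = any(a < b for a, b in zip(p, q))
    return no_larger and strictly_smaller


def admissible(points: Sequence[RiskPoint]) -> List[RiskPoint]:
    """Risk points that are not bettered by any other point in the list."""
    return [p for p in points if not any(is_better(q, p) for q in points)]


# Hypothetical risk points (variance under shape 1, variance under shape 2).
candidates = [(1.0, 4.0), (6.0, 2.0), (2.0, 3.0), (3.0, 3.5)]
print(admissible(candidates))
```

Here the point (3.0, 3.5) is dropped because (2.0, 3.0) is better at both shapes; the other three points are mutually incomparable.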

Under certain general but not universal conditions attention
may be restricted without loss to admissible estimators, in the
sense that for each inadmissible estimator there exists at least
one better admissible estimator. It is proved below that the risk
points of the ALUEs constitute a convex curve connecting the risk
points of the BLUEs, as in Figures 1-3.

Depending upon the structure of a problem such as that
represented in Fig. 1, the ALUEs might be represented by a convex curve
passing very near the "ideal" point (1,2); or by one lying very
near the line-segment connecting the BLUE points as in Fig. 2. In
the first case any estimator with risk point very near (1,2) would
naturally be called highly efficiency-robust since it has
efficiency nearly 100% uniformly over $\Lambda$. In the second case, it is
seen that uniformly high efficiency is unattainable, but that
efficiency at least 54% is provided by an ALUE with risk point near
(1.3, 3.68); here BLUE $\hat\mu_1$ provides efficiency bounded below by
only 50%, and $\hat\mu_2$ by only 1/6. In problems having a third type
of structure, one BLUE estimator may have efficiency nearly 100%
uniformly over $\Lambda$; in such a case further consideration of other
ALUEs would be of minor practical importance. In such comparisons
of estimators, it may be of interest to consider the formal
criterion of maximin efficiency (the maximum attainable lower bound
of efficiency of a LUE over $\Lambda$); or the criterion of minimax
variance (in problems formulable with some common scale unit for
different shapes). However the three types of structures of
simplified problems described serve to illustrate the limited value
of any single criterion or measure of efficiency-robustness, and
the fact that practical interest leads in principle to
consideration of (a) the configuration of risk points of BLUE estimators;
and in many cases also (b) the configuration of risk points of
ALUE estimators.
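
The efficiency comparisons in this discussion amount to simple ratios of variances. As a rough sketch (not from the original report), the code below takes the ideal point (1,2) and the BLUE risk points (1,4) and (6,2) of the hypothetical example, together with an assumed intermediate risk point (1.3, 3.68) chosen so that its lower bound on efficiency is about 54%, and computes the lower bound of efficiency of each estimator. The names `BLUE_1`, `BLUE_2` and `ALUE_mid` are labels introduced here for illustration.

```python
from typing import List, Sequence


def efficiencies(risk: Sequence[float], best: Sequence[float]) -> List[float]:
    """Efficiency at each shape: best attainable variance / actual variance."""
    return [b / r for r, b in zip(risk, best)]


def min_efficiency(risk: Sequence[float], best: Sequence[float]) -> float:
    """Lower bound of efficiency over the (finite) set of shapes."""
    return min(efficiencies(risk, best))


ideal = (1.0, 2.0)  # BLUE variances under shapes 1 and 2
estimators = {
    "BLUE_1": (1.0, 4.0),
    "BLUE_2": (6.0, 2.0),
    "ALUE_mid": (1.3, 3.68),  # assumed intermediate point, for illustration
}
for name, risk in estimators.items():
    print(name, round(min_efficiency(risk, ideal), 3))

# Maximin choice among these three candidates only.
maximin_name = max(estimators, key=lambda k: min_efficiency(estimators[k], ideal))
print(maximin_name)
```

The maximin comparison here ranges only over the three listed candidates; a genuine maximin efficiency would require the whole curve of ALUE risk points.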

While problems in which $\Lambda$ contains only several points are
artificially simplified versions of more realistic problems, it
may be good research strategy to begin efficiency-robustness
investigation of a realistic model $\Lambda$ by preliminary consideration
of one or several such simplified versions. If for example two
shapes are selected for preliminary study, from among those in a
model of practical interest; and if a structure like that of
Figure 2 is found, in which maximin efficiency is far below unity;
then it follows that in the original problem embracing additional
shapes the maximin efficiency can be no higher (in general it will
be lower). Such negative results found early in an investigation
may provide economical redirection of research efforts. On the
other hand, positive results found in simplified problems provide
tentative encouragement and suggest specific estimators worth
considering further in the original problem. (Hypothetical examples
can be seen by retrospective consideration of the results in Tukey
(1960), Figure 4, p. 460, considering pairs of shapes such as
$\gamma = 0$ or $.05$. For each such pair, the estimators considered
can be represented in a figure analogous to Figures 1-3.)

The body of knowledge of efficiency-robustness available at
any stage can be viewed usefully in terms of the headings:

(a) Negative results: pairs, or larger sets, $\Lambda$ of shapes over
which uniformly high efficiency is unattainable.

(b) Positive results: sets of shapes $\Lambda$ over which one can
attain certain fairly high lower bounds on efficiency, together with
construction of corresponding estimators. (The 5% truncated mean
represented in Fig. 4 of Tukey (1960) is a significant example.)

(c) Intermediate cases $\Lambda$.

Of course practical judgments concerning the range $\Lambda$, and
concerning gains or sacrifices in terms of variances or
efficiencies at various $\lambda$ when estimators are compared, are crucial to
practically useful formulations of problems and selections of
estimators. Tukey's papers introduce a specific "contaminated
normal" family of distributions as a model pertinent to many
practical problems of estimation of location and scale, together with
discussion of the background of experience and theory which
suggest such a model. (Cf. also Bühler (1964).)

The estimators (of location and scale) investigated
by Tukey for possible robustness of asymptotic efficiency under
this model were LUEs. It had been shown that restriction to such
estimators does not generally reduce asymptotic efficiency in
cases of known shapes (Blom (1958)); and considerations described
by Tukey focused attention particularly on certain of these
(truncated (or trimmed) or Winsorized means for location). Evidently
the attainability of uniformly high efficiency of estimation of
location, even in an unrestricted class of estimators, was not
guaranteed a priori by any theoretical knowledge available, but it
was found that certain truncated means were uniformly highly
asymptotically efficient over the given $\Lambda$. In this case there would
be little to be gained by considering ALUEs over the given class;
more precisely, for specific finite sample sizes for which Tukey's
asymptotic results effectively hold, there would be little to gain
by considering other estimators - but in other cases here, and more
generally in other investigations, the construction of some ALUEs
and computations of their variances can be considered. (Huber's
(1964), pp. 74-5, comments on the relative merits of variances and
asymptotic variances are pertinent here.)


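2.2 Derivations. Let $U_r = (X_r - \mu)/\sigma$ and $V_r = (Y_r - \mu)/\sigma$,
for $r = 1, \ldots, n$. The moments of the standardized ordered
observations $V_r$ are independent of $\mu$ and $\sigma$ but in general depend
upon $\lambda$. Let

$$\alpha_r^\lambda = E(V_r \mid \mu, \sigma, \lambda), \quad \text{and} \quad \omega_{r,s}^\lambda = \mathrm{Cov}(V_r, V_s \mid \mu, \sigma, \lambda)$$

for $r, s = 1, \ldots, n$. Then for the original ordered observations we
have

$$E(Y_r \mid \mu, \sigma, \lambda) = \mu + \sigma \alpha_r^\lambda, \quad \text{and} \quad \mathrm{Cov}(Y_r, Y_s \mid \mu, \sigma, \lambda) = \sigma^2 \omega_{r,s}^\lambda.$$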

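Lloyd's method of constructing BLUE estimators can be applied
here to construct ALUEs after application of the usual formal Bayes
technique: Let $G = G(\lambda)$ denote an arbitrary cumulative
probability distribution function over $\Lambda$, and let

$$f(x_1, \ldots, x_n \mid \mu, \sigma, G) = \int \prod_{i=1}^{n} f((x_i - \mu)/\sigma, \lambda) \, dG(\lambda).$$

The latter is the p.d.f. of a sample $(X_1, \ldots, X_n)$ taken under one
or another shape $\lambda$, with $\lambda$ assigned values randomly according
to distribution $G$. We have

$$E(Y_r \mid \mu, \sigma, G) = \int E(Y_r \mid \mu, \sigma, \lambda) \, dG(\lambda) = \int (\mu + \sigma \alpha_r^\lambda) \, dG(\lambda) = \mu + \sigma \alpha_r^G,$$

where $\alpha_r^G = \int \alpha_r^\lambda \, dG(\lambda)$,

and

$$\begin{aligned}
\mathrm{Cov}(Y_r, Y_s \mid \mu, \sigma, G) &= E(Y_r Y_s \mid \mu, \sigma, G) - (\mu + \sigma \alpha_r^G)(\mu + \sigma \alpha_s^G) \\
&= \int \left[ \sigma^2 \omega_{r,s}^\lambda + (\mu + \sigma \alpha_r^\lambda)(\mu + \sigma \alpha_s^\lambda) \right] dG(\lambda) - (\mu + \sigma \alpha_r^G)(\mu + \sigma \alpha_s^G) \\
&= \sigma^2 \left[ \int \omega_{r,s}^\lambda \, dG(\lambda) + \int \alpha_r^\lambda \alpha_s^\lambda \, dG(\lambda) - \int \alpha_r^\lambda \, dG(\lambda) \int \alpha_s^\lambda \, dG(\lambda) \right] \\
&= \sigma^2 \omega_{r,s}^G, \text{ say.}
\end{aligned}$$

Thus the conditions of the Gauss-Markoff theorem are satisfied, for
minimum-variance linear unbiased estimation of $\mu$, when the values
of $\alpha_r^G$ and $\omega_{r,s}^G$ are known.

Let $y = (y_1, \ldots, y_n)'$, $\omega^\lambda = (\omega_{r,s}^\lambda)$, $\omega^G = (\omega_{r,s}^G)$, $D_G = (\omega^G)^{-1}$,
$\alpha^\lambda = (\alpha_1^\lambda, \ldots, \alpha_n^\lambda)'$, $\alpha^G = (\alpha_1^G, \ldots, \alpha_n^G)'$, $1 = (1, 1, \ldots, 1)'$,
an $(n \times 1)$ vector, $p = (1, \alpha^G)$, and $\theta = (\mu, \sigma)'$. Then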


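$$E(Y \mid \mu, \sigma, G) = p\theta \quad \text{and} \quad \mathrm{Cov}(Y \mid \mu, \sigma, G) = \sigma^2 \omega^G.$$

The best (minimum variance) unbiased estimator linear in the $y_r$'s
is, by the Gauss-Markoff theorem,

$$\hat\theta_G = (p' D_G p)^{-1} p' D_G y$$

and the covariance matrix of the estimator is

$$\mathrm{Cov}(\hat\theta_G \mid \mu, \sigma, G) = \sigma^2 (p' D_G p)^{-1} = \sigma^2 \begin{pmatrix} 1' D_G 1 & 1' D_G \alpha^G \\ 1' D_G \alpha^G & \alpha^{G\prime} D_G \alpha^G \end{pmatrix}^{-1}.$$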


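In the important case of $f((x-\mu)/\sigma, \lambda)$ symmetric about 0
for each $\lambda$, to which our further discussion and results are
limited, we have more simply

$$\hat\theta_G = \left[ \frac{1' D_G y}{1' D_G 1}, \ \frac{\alpha^{G\prime} D_G y}{\alpha^{G\prime} D_G \alpha^G} \right]'$$

and

$$\mathrm{Cov}(\hat\theta_G \mid \mu, \sigma, G) = \sigma^2 \begin{pmatrix} 1/(1' D_G 1) & 0 \\ 0 & 1/(\alpha^{G\prime} D_G \alpha^G) \end{pmatrix},$$

since $1' D_G \alpha^\lambda = 0$ for each $\lambda$: $D_G$ and $\omega^G$ are symmetric
(also under reversal of the order of the indices), and
$\alpha_i^\lambda = -\alpha_{n-i+1}^\lambda$ for $i = 1, \ldots, n$.
Restricting consideration now to estimation of $\mu$ we find

$$E(\hat\mu_G \mid \lambda) = 1' D_G (\mu 1 + \sigma \alpha^\lambda) / 1' D_G 1 = \mu.$$

Thus for each $G$, $\hat\mu_G$ is an unbiased linear estimator of $\mu$ under
each shape $\lambda \in \Lambda$.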
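
The location estimator of the symmetric case has weights $a = D 1 / (1' D 1)$, with $D$ the inverse of the relevant covariance matrix of the standardized order statistics. The sketch below is an illustration only, written in plain Python; the function names are introduced here, and the covariance values for three standard-normal order statistics are rounded tabulated values supplied as an assumption rather than taken from the report.

```python
from typing import List

Matrix = List[List[float]]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def location_weights(omega: Matrix) -> List[float]:
    """Weights a = D 1 / (1' D 1), where D is the inverse of omega."""
    d_one = solve(omega, [1.0] * len(omega))  # D 1, without forming D
    total = sum(d_one)                        # 1' D 1
    return [v / total for v in d_one]


def variance_factor(weights: List[float], omega: Matrix) -> float:
    """a' omega a: the variance of sum(a_i y_i) in units of sigma^2."""
    n = len(weights)
    return sum(weights[r] * omega[r][s] * weights[s]
               for r in range(n) for s in range(n))


# Rounded covariances of standard-normal order statistics, n = 3 (assumed values).
omega_normal = [[0.5595, 0.2757, 0.1649],
                [0.2757, 0.4487, 0.2757],
                [0.1649, 0.2757, 0.5595]]
a = location_weights(omega_normal)
print([round(v, 3) for v in a], round(variance_factor(a, omega_normal), 3))
```

For the normal shape the weights come out equal, so the estimator is the sample mean with variance factor close to 1/3, as expected. Solving the linear system for $D 1$ avoids forming the full inverse, which is a design choice of this sketch.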

For our purposes each such estimator is represented by its
variance function

$$\mathrm{Var}(\hat\mu_G \mid \mu, \sigma, \lambda) = \sigma^2 \left[ 1' D_G \omega^\lambda D_G 1 / (1' D_G 1)^2 \right],$$

which is independent of $\mu$. This function of $\lambda$ may be
interpreted as a "risk point" characterizing $\hat\mu_G$, in a space with
coordinate axes indexed by the respective $\lambda \in \Lambda$ and with
corresponding coordinates given by the preceding variance function, as
in the simplified examples of Figures 1-3 above. To prove that
each $\hat\mu_G$ is an admissible linear unbiased estimator, it suffices
to note that it is a unique Bayes solution for the $G$-mixture.
Hence $\hat\mu_G$ uniquely minimizes $\mathrm{Var}(\mu^* \mid \mu, \sigma, G)$ over a class of
linear estimators which includes all those unbiased for each $\lambda$. If
$\hat\mu_G$ were inadmissible, there would exist a LUE $\mu'$ with
$\mathrm{Var}(\mu' \mid \mu, \sigma, \lambda) \le \mathrm{Var}(\hat\mu_G \mid \mu, \sigma, \lambda)$ for each $\lambda$ and hence with
$\mathrm{Var}(\mu' \mid \mu, \sigma, G) \le \mathrm{Var}(\hat\mu_G \mid \mu, \sigma, G)$, contradicting that $\hat\mu_G$ uniquely
attains the latter minimum.

To prove that the set of risk points $\mathrm{Var}(\hat\mu_G \mid \mu, \sigma, \lambda)$, $\lambda \in \Lambda$,
of such admissible LUE's constitutes a convex hypersurface (or
convex curve as in the simplified examples of Figures 1-3) consider
for each pair of ALUE's $\mu^*$, $\mu^{**}$ and each number $q$, $0 < q < 1$,
the "mixed" estimator $\mu'$: for each $y$, $\mu'(y)$ is assigned the
value $\mu^*(y)$ with probability $q$ and the value $\mu^{**}(y)$ with
probability $1-q$. Clearly $\mu'$ is unbiased and has risk point

$$\mathrm{Var}(\mu' \mid \mu, \sigma, \lambda) = q \, \mathrm{Var}(\mu^* \mid \mu, \sigma, \lambda) + (1-q) \, \mathrm{Var}(\mu^{**} \mid \mu, \sigma, \lambda).$$

The latter formula shows that the risk points of all mixed LUE's
constitute a convex set, which includes the set of risk points of
ALUE's. Next, let us consider again the "Bayes risk", to be
minimized as above but with respect to this larger class of "mixed"
estimators: We have

$$\begin{aligned}
\mathrm{Var}(\mu' \mid \mu, \sigma, G) &= \int \mathrm{Var}(\mu' \mid \mu, \sigma, \lambda) \, dG(\lambda) \\
&= q \int \mathrm{Var}(\mu^* \mid \mu, \sigma, \lambda) \, dG(\lambda) + (1-q) \int \mathrm{Var}(\mu^{**} \mid \mu, \sigma, \lambda) \, dG(\lambda) \\
&= q \, \mathrm{Var}(\mu^* \mid \mu, \sigma, G) + (1-q) \, \mathrm{Var}(\mu^{**} \mid \mu, \sigma, G).
\end{aligned}$$

Thus the problem of minimizing the latter function in the class of
"mixed" estimators $\mu'$ is seen to reduce to the problem solved
above, and to be answered by the (non-mixed) ALUE $\hat\mu_G$. Thus the
latter has a risk point which is a "lower" boundary point of the
convex set referred to above. Arguments of continuous variation in
the class of distributions $G$ (which are elementary in the case
illustrated in Figures 1-3) show that the set of such risk points not
only lies in, but constitutes, the convex lower boundary
hypersurface of that set.
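
The role of the Bayes risk in picking out lower boundary points can be seen in a small numerical sketch. The code below is an illustration under stated assumptions, not the authors' computation: the risk points are hypothetical, the set of shapes has two points, and the prior weight $g$ of shape 0 is varied over a grid. For each $g$ the candidate with the smallest $G$-averaged variance is selected.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (variance under shape 0, variance under shape 1)


def bayes_risk(p: Point, g: float) -> float:
    """G-averaged variance when shape 0 has prior probability g."""
    return g * p[0] + (1.0 - g) * p[1]


def bayes_choices(points: List[Point], steps: int = 100) -> List[Point]:
    """Points that minimize the Bayes risk for some g on a grid over [0, 1]."""
    chosen: List[Point] = []
    for k in range(steps + 1):
        g = k / steps
        best = min(points, key=lambda p: bayes_risk(p, g))
        if best not in chosen:
            chosen.append(best)
    return chosen


# Hypothetical risk points; (3.5, 3.5) is bettered by (2.0, 2.8).
points = [(1.0, 4.0), (2.0, 2.8), (3.5, 3.5), (6.0, 2.0)]
print(bayes_choices(points))
```

Only the three points on the lower convex boundary are ever selected; the bettered point (3.5, 3.5) is a Bayes choice for no value of $g$.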

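Example. If $\Lambda$ contains just two points, $\lambda = 0, 1$, any $G$ may
be represented by $g = \mathrm{Prob}[\lambda = 0]$, $1 - g = \mathrm{Prob}[\lambda = 1]$, and

$$f((x-\mu)/\sigma, g) = g \, f((x-\mu)/\sigma, 0) + (1-g) \, f((x-\mu)/\sigma, 1),$$

$$E(y) = \mu 1 + \sigma \left( g \alpha^0 + (1-g) \alpha^1 \right),$$

$$\mathrm{Cov}(y) = \sigma^2 \left[ g \omega^0 + (1-g) \omega^1 + g(1-g)(\alpha^1 - \alpha^0)(\alpha^1 - \alpha^0)' \right].$$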
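
The covariance formula for the two-point mixture can be checked numerically. The following sketch takes $\mu = 0$ and $\sigma = 1$ and uses two small hypothetical discrete distributions in place of the shapes (an assumption made purely so that all moments can be computed exactly by enumeration); it compares the formula with the covariance computed directly from the mixed distribution.

```python
from typing import List, Tuple

Vector = List[float]
Dist = List[Tuple[float, Vector]]  # (probability, outcome vector) pairs


def mean(dist: Dist) -> Vector:
    n = len(dist[0][1])
    return [sum(p * v[i] for p, v in dist) for i in range(n)]


def cov(dist: Dist) -> List[Vector]:
    m = mean(dist)
    n = len(m)
    return [[sum(p * (v[r] - m[r]) * (v[s] - m[s]) for p, v in dist)
             for s in range(n)] for r in range(n)]


def mixture_cov(g: float, alpha0: Vector, omega0: List[Vector],
                alpha1: Vector, omega1: List[Vector]) -> List[Vector]:
    """g*omega0 + (1-g)*omega1 + g(1-g)(alpha1-alpha0)(alpha1-alpha0)'."""
    n = len(alpha0)
    d = [alpha1[i] - alpha0[i] for i in range(n)]
    return [[g * omega0[r][s] + (1 - g) * omega1[r][s] + g * (1 - g) * d[r] * d[s]
             for s in range(n)] for r in range(n)]


# Two hypothetical discrete "shapes" for a pair of ordered observations.
shape0: Dist = [(0.5, [-1.0, 1.0]), (0.5, [-0.5, 0.5])]
shape1: Dist = [(0.25, [-3.0, 0.0]), (0.5, [-1.0, 1.0]), (0.25, [0.0, 3.0])]
g = 0.3
mixed: Dist = ([(g * p, v) for p, v in shape0]
               + [((1 - g) * p, v) for p, v in shape1])

by_formula = mixture_cov(g, mean(shape0), cov(shape0), mean(shape1), cov(shape1))
direct = cov(mixed)
print(by_formula)
print(direct)
```

The two matrices agree; the last term of the formula accounts for the difference between the mean vectors of the two shapes.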


It is readily found that for $n = 3$, $\hat\mu_g = \sum_{i=1}^{3} a_i y_i$ with

$$a_1 = a_3 = \frac{\tfrac{1}{2} \left[ g(\omega_{22}^0 - \omega_{12}^0) + (1-g)(\omega_{22}^1 - \omega_{12}^1) \right]}{g \left( \tfrac{1}{2}\omega_{11}^0 + \tfrac{1}{2}\omega_{13}^0 + \omega_{22}^0 - 2\omega_{12}^0 \right) + (1-g) \left( \tfrac{1}{2}\omega_{11}^1 + \tfrac{1}{2}\omega_{13}^1 + \omega_{22}^1 - 2\omega_{12}^1 \right)}$$

and $a_2 = 1 - 2a_1$.

Here, as might be expected in this very simple case, the $\hat\mu_g$'s are
respective weighted averages of $\hat\mu_0$ and $\hat\mu_1$. It is not clear
whether this relation holds for larger $n$ and for more general $\Lambda$.

If the two shapes are normal and double-exponential,

$$f((x-\mu), 0) = \frac{1}{\sqrt{2\pi}} \exp\{-\tfrac{1}{2}(x-\mu)^2\},$$

$$f((x-\mu), 1) = \tfrac{1}{2} \exp\{-|x-\mu|\},$$

we have the ALUE's and their risk points given in Table 1 and
Figure 3. The moments from which these were calculated are
available in Hastings, et al (1947) and Sarhan (1954).

TABLE 1

   g     a_1 = a_3     a_2     var(mu_g | lambda=0)   var(mu_g | lambda=1)

   0       .148       .704            .532                   .591
  .25      .226       .548            .407                   .605
  .50      .275       .450            .351                   .628
  .75      .309       .382            .343                   .649
 1.00      .333       .333            .333                   .668

Here, as could be seen from the risk points of just $\hat\mu_0$ and $\hat\mu_1$,
use of $\hat\mu_0$, the BLUE under the normal shape (or else another ALUE
with a nearby risk point), is recommended by the fact that
achievement of any major part of the small decrease possible in
$\mathrm{Var}(\hat\mu_g \mid \lambda = 1)$ costs a large increase in
$\mathrm{Var}(\hat\mu_g \mid \lambda = 0)$.


[Figure 3. Risk points of a complete class of admissible linear
unbiased estimates. Axes: $\mathrm{Var}(\hat\mu_g \mid \lambda = 0)$ and $\mathrm{Var}(\hat\mu_g \mid \lambda = 1)$.]


For any given family of symmetric shapes $f((x-\mu)/\sigma, \lambda)$,
$\lambda \in \Lambda$, the properties of ALUE's $\hat\mu_G$ are characterized by their
respective variance functions $\sigma^2 [1' D_G \omega^\lambda D_G 1 / (1' D_G 1)^2]$. Use of
these functions, and the formulae for the estimators themselves,

$$\hat\mu_G = 1' D_G y / 1' D_G 1,$$

depends upon availability of the moments $\alpha_r^\lambda$, $\omega_{r,s}^\lambda$ for respective
shapes and sample sizes of interest; and upon feasibility of the
indicated matrix computations for at least several distributions
$G$ in each problem to be explored. Availability of such moments
for shapes representing plausible error-distributions is increasing
(e.g. Greenberg and Sarhan (1962), Birnbaum and Dudman (1965));
but for such important cases as the contaminated normal family
moments are not available. The computation of $D_G = (\omega^G)^{-1}$ from
given $\alpha^\lambda$'s, $\omega^\lambda$'s, and $G$ is evidently generally heavy, even when
$\Lambda$ contains only several points; conceivably these might be
facilitated by discovery of algebraic simplifications and use of large
scale computers.

It will often be natural to consider families of shapes which
are convex, in the sense that for each $q$, $0 < q < 1$, if
$\lambda', \lambda'' \in \Lambda$ then the shape

$$q \, f((x-\mu)/\sigma, \lambda') + (1-q) \, f((x-\mu)/\sigma, \lambda'')$$

is also in $\Lambda$. The contaminated normal family, for example, has
this property, which was also assumed in the investigation of
Bühler (1964). (The property clearly cannot hold when $\Lambda$
contains a finite number of points.) An interesting open question is
whether, for such convex families $\Lambda$, each ALUE $\hat\mu_G$ is also a
BLUE $\hat\mu_\lambda$ for some $\lambda \in \Lambda$.

3. Efficiency-robust two-sample rank tests and estimators.
3.1 Introduction. Let $x_1, x_2, \ldots, x_{n_1}, y_1, y_2, \ldots, y_{n_2}$ be
$N = n_1 + n_2$ independent random variables, the $x_i$'s having common
unknown absolutely continuous c.d.f. $P_x$ and the $y_j$'s having
common unknown absolutely continuous c.d.f. $P_y$. Let
$\frac{\partial}{\partial x} G(x, \theta) = g(x, \theta)$ be a specified one-parameter family of density
functions with $\theta$ taking values in an open interval. We consider
the problem of testing the simple hypothesis

$$H_0: \ P_x = G(x, \phi_0), \quad P_y = G(y, \phi_0)$$

against a composite alternative

$$H_1: \ P_x = G(x, \theta), \quad P_y = G(y, \phi)$$

where $\theta$ and $\phi$ are any values in $R$ for which $\theta > \phi \ge \phi_0$.

It is well known that the tests which are valid (i.e. which
have specified size under $H_0$ for all continuous $G$) are just the
rank tests, those based on the rank order statistics

$$z_{Nj} = \begin{cases} 1 & \text{if the } j\text{th smallest of the pooled sample of } N \text{ observations is an } x, \\ 0 & \text{otherwise.} \end{cases}$$

Such nonparametric two-sample tests might be described as
"universally validity-robust". In sharpest contrast with the breadth of
this validity property, the efficiency theory of such tests has
been restricted exclusively to considerations of power under one
or another assumed known form for $G$. In the present section we
extend this theory systematically to the case of $G(x, \theta, \lambda)$ where
$\theta$ remains the parameter of primary interest and where the
(nuisance) "shape" parameter $\lambda$ has an unknown value in a specified set
$\Lambda$. To illustrate by reference to familiar rank tests, it is well
known that the Fisher test is uniquely locally-best if $G$ is
normal, and that the Wilcoxon test is uniquely locally-best if $G$ is
logistic. Thus use of either test risks a certain loss of
efficiency in case that shape holds for which it is not best. Hence
it is of much theoretical and practical interest to ask what other
tests should be considered here, with a view to attaining the
highest possible efficiencies under the two possible shapes.
Introductory and interpretive remarks here will be few because they would
be for the most part direct analogues of Section 2.1 above, but
some comments will be made here with particular reference to rank
tests, and in Section 3.5 below with reference to corresponding
estimators.
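
The rank order statistics on which all such tests are based are easily computed from two samples. The following is a minimal sketch with hypothetical data; the function name is introduced here, and ties are ignored since they have probability zero for continuous distributions.

```python
from typing import List, Sequence


def rank_order_statistics(xs: Sequence[float], ys: Sequence[float]) -> List[int]:
    """z[j-1] = 1 if the j-th smallest pooled observation is an x, else 0."""
    pooled = [(v, 1) for v in xs] + [(v, 0) for v in ys]
    pooled.sort(key=lambda pair: pair[0])
    return [label for _, label in pooled]


# Hypothetical samples without ties.
x = [2.3, 0.7, 3.1]
y = [1.5, 0.2, 2.9, 1.1]
z = rank_order_statistics(x, y)
print(z)  # [0, 1, 0, 0, 1, 0, 1]
```

Because only the ordering of the pooled sample enters, the vector is unchanged by any strictly increasing transformation applied to all observations, which is the source of the validity property described above.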

It has sometimes been found that an LMPRT (locally most
powerful rank test) has some efficiency-robustness properties (e.g.
the Wilcoxon test, LMPRT for logistic shapes, is also highly
efficient for normal shapes). While such results are valuable when
found, it seems of practical value for other efficiency-robustness
problems, and of general theoretical value, to provide systematic
theory and techniques for efficiency-robustness in this area.

One type of question, which is rather basic to the comparison
of parametric and non-parametric approaches in general, is
illustrated by the following particular results (due to Chernoff and
Savage (1958) and Capon (1961)). The Fisher rank test is known to
be asymptotically efficient, as compared with the two-sample t-test,
under normal shapes; and analogous efficiency holds for the
Wilcoxon test, as compared with the parametric likelihood-ratio
test under logistic shapes, as it does for all LMPRTs compared to
the corresponding likelihood ratio test. Further, the Fisher test
is asymptotically preferable to the t-test in the sense that its
efficiency relative to the latter is at least unity for all shapes
(under mild assumptions); however other LMPRTs fail to have the
analogous property: there are shapes under which the relative
efficiency of the LMPRT to its parametric analogue falls below unity.
In such appraisals of given rank tests, for relative efficiency in
relation to respectively parametrically-best tests for various
families of shapes, it would seem natural and promising to extend
consideration to rank tests characterized by optimal
efficiency-robustness.

3.2 Efficiency-robust rank tests. Capon (1961), making use of a
general theorem of Hoeffding, has shown that, under certain
regularity conditions, the locally most powerful rank test (LMPRT) is
given by a critical function of the form

$$\varphi(z) = \begin{cases} 0 & \text{if } T(z_{N1}, \ldots, z_{NN}) < C_\alpha, \\ 1 & \text{if } T(z_{N1}, \ldots, z_{NN}) > C_\alpha, \\ k_\alpha & \text{if } T(z_{N1}, \ldots, z_{NN}) = C_\alpha, \end{cases}$$

where

$$T_N = T_N(z) = T(z_{N1}, \ldots, z_{NN}) = \frac{1}{n_1} \sum_{i=1}^{N} a_{Ni} z_{Ni}, \qquad a_{Ni} = E_{\phi_0, \phi_0} \left[ \frac{\partial}{\partial\theta} \log g(W_i, \theta) \Big|_{\theta = \phi_0} \right],$$

and where $W_i$ is the $i$th smallest of the $N$ observations. The
notation $E_{\theta, \phi}(\cdot)$ indicates that expectation is taken under the
assumption that the $x$'s and $y$'s are distributed according to $g(x, \theta)$
and $g(y, \phi)$ respectively.
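
For small samples a statistic of this linear form, and the null distribution from which a critical value $C_\alpha$ would be chosen, can be computed by direct enumeration, since under the hypothesis all placements of the $x$'s among the $N$ ranks are equally likely. The sketch below is illustrative only: the scores are assumed to be linear in the rank (a Wilcoxon-type choice not taken from the report), and the function names are introduced here.

```python
from itertools import combinations
from typing import List, Sequence


def rank_statistic(scores: Sequence[float], z: Sequence[int]) -> float:
    """T_N = (1/n1) * sum_i a_Ni z_Ni, with n1 the number of x's."""
    n1 = sum(z)
    return sum(a * zi for a, zi in zip(scores, z)) / n1


def null_distribution(scores: Sequence[float], n1: int) -> List[float]:
    """Values of T_N over all equally likely arrangements under H0."""
    n = len(scores)
    values = []
    for x_positions in combinations(range(n), n1):
        z = [1 if i in x_positions else 0 for i in range(n)]
        values.append(rank_statistic(scores, z))
    return sorted(values)


def upper_tail_probability(scores: Sequence[float], z: Sequence[int]) -> float:
    """P(T_N >= observed value) under H0, by full enumeration."""
    observed = rank_statistic(scores, z)
    values = null_distribution(scores, sum(z))
    return sum(v >= observed - 1e-12 for v in values) / len(values)


N = 7
# Assumed scores, linear in the rank i (Wilcoxon-type), for illustration.
wilcoxon_scores = [2 * i / (N + 1) - 1 for i in range(1, N + 1)]
z = [0, 1, 0, 0, 1, 0, 1]  # hypothetical: the x's hold ranks 2, 5 and 7
print(rank_statistic(wilcoxon_scores, z))
print(upper_tail_probability(wilcoxon_scores, z))
```

Full enumeration is feasible only for small $N$, since the number of arrangements grows as the binomial coefficient of $N$ and $n_1$.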

The size $\alpha_T$ of the test is given by

$$\alpha_T = \sum_z \varphi(z) \, P\{z \mid H_0\}$$

and the power function $P_T(\theta, \phi)$ by

$$P_T(\theta, \phi) = \sum_z \varphi(z) \, P\{z \mid H_1\},$$

where $P\{z \mid H_i\}$ represents the probability that $Z = z$ under
$H_i$, $i = 0, 1$, and the summation is over all possible $z$.

We turn now to the case in which the form of $G$ is
incompletely known; and to the corresponding problem of constructing
admissible rank tests (ARTs) (with respect to specified families $\Lambda$ of
shapes), with attention again focused on a neighborhood of $\phi_0$.
Let $\Omega = \{(\theta, \lambda)\}$ represent a specified class of c.d.f.'s
$G(x, \theta, \lambda) = G_\theta(x, \lambda)$, where the range of $\theta$ is an open interval
and $\lambda \in \Lambda$, a specified family of "shapes". For any test $\delta$, we
shall refer to the negative of the derivative of the power function,

$$-P'_\delta(\phi_0, \phi_0, \lambda) = -\frac{\partial}{\partial\theta} P_\delta(\theta, \phi_0, \lambda) \Big|_{\theta = \phi_0},$$

as the risk function $r = r(\delta, \lambda)$, $\lambda \in \Lambda$.

Let $\mathcal{D}_\alpha$ be the class of rank tests $\delta(z)$ such that the
probability of a type I error $P_\delta(\phi_0, \phi_0, \lambda) = \alpha$ uniformly in $\lambda$.
Admissibility within $\mathcal{D}_\alpha$ is defined in the usual way in terms of
the risk function $r(\delta, \lambda)$. We consider next the problems of
characterizing the admissible tests and of examining their risk points.

Let $J(\lambda)$ be any a priori distribution over $\Lambda$. We call
any test $\delta_J \in \mathcal{D}_\alpha$ a Bayes solution with respect to $J$ if $\delta_J$
minimizes the Bayes risk

$$r(\delta, J) = -\int P'_\delta(\phi_0, \phi_0, \lambda) \, dJ(\lambda)$$

over all $\delta \in \mathcal{D}_\alpha$. For any fixed $J$, consider the family of
distributions $G(x, \theta, J) = \int G(x, \theta, \lambda) \, dJ(\lambda)$ and our testing problem
concerning $\theta$ applied to this family. We have for any $\delta$

$$r(\delta, J) = \int r(\delta, \lambda) \, dJ(\lambda) = -\sum_z \delta(z) \int \left[ \frac{\partial}{\partial\theta} P\{z \mid \theta, \phi_0, \lambda\} \right]_{\theta = \phi_0} dJ(\lambda),$$

which by use of Hoeffding's theorem reduces, apart from a positive
constant factor, to

$$r(\delta, J) = -\sum_z \delta(z) \, T_N^J(z),$$

where

$$T_N^J(z) = \int T_N^\lambda(z) \, dJ(\lambda).$$

Thus $r(\delta, J)$ is minimized essentially uniquely, among all tests
in $\mathcal{D}_\alpha$, by

$$\delta_J(z) = \begin{cases} 0 & \text{if } T_N^J < C_\alpha, \\ 1 & \text{if } T_N^J > C_\alpha, \\ k_\alpha & \text{if } T_N^J = C_\alpha. \end{cases}$$

Here "essentially unique" means that any other test minimizing
$r(\delta, J)$ has the same risk function as $\delta_J$. The admissibility of
each such $\delta_J$ follows immediately from the well known property of
admissibility of essentially unique Bayes solutions.
The  foraiula  for  T,,  above  reduces  to 


■^N^^'  "  n  - 


N 


1  i=l 


^•Ni^Ni  ' 


where 


i ( a} tb 


2J 


4i  'J  4i«J(^)  • 


This formula admits the interesting and useful interpretation that
each of the ARTs $\delta_J$ is based on a linear combination of the
$z_{Ni}$'s, resembling the LMPRTs in this respect; and that the "scores"
$a_{Ni}^J$ defining $\delta_J$ are respectively simply the $J$-weighted averages
of the corresponding scores $a_{Ni}^\lambda$, $\lambda \in \Lambda$. For example, if $J$
gives equal weights $\frac12$ to the normal and logistic shapes, then
$\delta_J$ is given by scores $a_{Ni}^J = \frac12 a_{Ni}^{1} + \frac12 a_{Ni}^{2}$, where $a_{Ni}^{1}$ is a
"normal" (Fisher test) score and $a_{Ni}^{2}$ is a Wilcoxon test score,
$i = 1,\ldots,N$.
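
The averaging of scores described above can be sketched numerically. The following is a minimal illustration, not the authors' procedure: the function names are hypothetical, the Wilcoxon scores are taken as $2i/(N+1) - 1$, and the "normal" scores are approximated by the inverse normal c.d.f. at $i/(N+1)$ (an assumption standing in for the exact expected normal order statistics).

```python
from statistics import NormalDist


def wilcoxon_scores(n):
    """Logistic (Wilcoxon) scores J(u) = 2u - 1 evaluated at u = i/(n+1)."""
    return [2.0 * i / (n + 1) - 1.0 for i in range(1, n + 1)]


def normal_scores(n):
    """Approximate normal scores: inverse normal c.d.f. at u = i/(n+1).

    Assumption: this replaces the exact expected normal order statistics.
    """
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf(i / (n + 1)) for i in range(1, n + 1)]


def mixed_scores(weights, score_vectors):
    """J-weighted average of score vectors; weights are the prior masses."""
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("prior weights must sum to 1")
    size = len(score_vectors[0])
    return [sum(w * s[i] for w, s in zip(weights, score_vectors))
            for i in range(size)]


def rank_statistic(scores, z, n1):
    """Linear rank statistic (1/n1) * sum_i scores[i] * z[i]."""
    return sum(a * zi for a, zi in zip(scores, z)) / n1


N = 5
scores = mixed_scores([0.5, 0.5], [normal_scores(N), wilcoxon_scores(N)])
# z[i] = 1 when the (i+1)-th smallest pooled observation is from sample one.
z = [0, 0, 1, 1, 1]
t_value = rank_statistic(scores, z, n1=3)
```

Any other prior weights over a finite set of shapes can be passed to `mixed_scores` in the same way; only the score vectors change.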

Because each ART $\delta_J$ has this form, it is possible to apply
familiar techniques to compute its asymptotic efficiency under each
shape $\lambda \in \Lambda$. As is the case with LMPRTs, for ARTs it is only
asymptotic approximations to power functions which are now practically
available in numerical form. These are the subject of the following
subsection.

It is useful in connection with investigations of specific
problems to note that the set of risk functions $r(\delta_J,\lambda)$, $\lambda \in \Lambda$,
of the respective ARTs constitutes a subset of the "lower" boundary
hypersurface of the convex set of risk functions of all tests in
$\mathcal{D}_\alpha$. The convexity of the latter set follows by a familiar
elementary argument from the fact that $\mathcal{D}_\alpha$ is a convex set of
critical functions and the observation that $r(\delta,\lambda)$ is a linear
functional of $\delta$. Under general but not universal conditions, of which
a simple case is that $\Lambda$ contains a finite number of points, it
is known that the admissible class is a complete class in $\mathcal{D}_\alpha$, and
that therefore its risk functions constitute the full "lower"
boundary hypersurface mentioned. Evidently a corresponding
convexity property holds for asymptotic relative efficiency functions.

3.3 Asymptotic Relative Efficiencies of ARTs. It is convenient,
and represents no essential restriction of methods, to present
derivations and examples here for the case in which $\Lambda$ contains two
points only, $\lambda = 1,2$. We assume here that

(i) $g_\theta(x,\lambda)$ and $\frac{\partial}{\partial\theta} g_\theta(x,\lambda)$ are continuous with respect to
$\theta$ in $R$ for almost all $x$, and $g_\theta(x,\lambda)$ and $\frac{\partial}{\partial\theta} g_\theta(x,\lambda)$ are
dominated by functions which are Riemann integrable on the real
line,

(ii) $g_\theta(x,\lambda)$ and $g_\phi(x,\lambda)$ have the same support,

(iii)
$$\left|J_\lambda^{(i)}(H)\right| = \left|\frac{d^i J_\lambda}{dH^i}\right| \le K\,[H(1-H)]^{-i-\frac12+\delta} \quad \text{for } 0 < H < 1,\ i = 0,1,2,$$
$\lambda = \lambda_1, \lambda_2$ (except perhaps for a finite number of $H$ where $J_\lambda^{(i)}$
may fail to exist) for some $\delta > 0$, $K$ a constant and $J_\lambda$
defined by

$$J_\lambda(G_{\phi_0}(x,\lambda)) = \left[\frac{\partial}{\partial\theta} \log g_\theta(x,\lambda)\right]_{\theta=\phi_0} \tag{1}$$

and where

$$0 < \lim_{N \to \infty} \frac{n_1}{n_2} = r < \infty.$$

Let $G_\theta(x,\lambda_1) = G_\theta(x)$ and $G_\theta(x,\lambda_2) = G^*_\theta(x)$.
Also let

$$J_p = pJ_G + (1-p)J_{G^*}, \qquad 0 < p < 1.$$

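THEOREM: Under assumptions (i), (ii) and (iii),

$$\lim_{N \to \infty} P_{\theta\phi}\left\{ \frac{T_N^p - \mu_{\theta\phi}(T_N^p)}{\sigma_{\theta\phi}(T_N^p)} \le t \right\} = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,dx,$$

where

$$\mu_{\theta\phi}(T_N^p) = \int_{-\infty}^{\infty} J_p(H_{\theta\phi}(x))\,dG_\theta(x)$$

and

$$\sigma^2_{\theta\phi}(T_N^p) = \frac{2n_2}{N^2} \left\{ \iint_{-\infty<x<y<\infty} G_\phi(x)(1-G_\phi(y))\,J_p'(H_{\theta\phi}(x))\,J_p'(H_{\theta\phi}(y))\,dG_\theta(x)\,dG_\theta(y) \right.$$
$$\left. {} + \frac{n_2}{n_1} \iint_{-\infty<x<y<\infty} G_\theta(x)(1-G_\theta(y))\,J_p'(H_{\theta\phi}(x))\,J_p'(H_{\theta\phi}(y))\,dG_\phi(x)\,dG_\phi(y) \right\}$$

and where

$$H_{\theta\phi}(x) = \frac{n_1}{N}\,G_\theta(x) + \frac{n_2}{N}\,G_\phi(x),$$

providing $\sigma_{\theta\phi}(T_N^p) \ne 0$. In particular under the null hypothesis
for $G$ or $G^*$, $\sigma^2_{\phi_0\phi_0} = \sigma_0^2$ is given by

$$\frac{n_1 N}{n_2}\,\sigma_0^2 = \int_0^1 J_p^2(x)\,dx - \left( \int_0^1 J_p(x)\,dx \right)^2.$$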

This result is an immediate consequence of a theorem of
Chernoff and Savage (1958). Ansari and Bradley (1960) have pointed
out that if a finite number of exceptional points are allowed in
condition (iii), the theorem remains valid.

Obviously the theorem remains true if we replace $G_\theta(x)$ by
$G^*_\theta(x)$ above. We use this result to calculate ARE's of $T_N^p$ and
ART's.

For purposes of investigating the asymptotic properties of the
admissible tests $T_N^p$ we use the Pitman (1948), Noether (1955)
criterion for the ARE of two sequences of tests $W = \{W_N\}$ and
$W^* = \{W^*_N\}$. Consider the sequence of alternatives
$\Delta_N = \theta_N - \phi_N = kN^{-1/2}$, where $k$ is a non-zero constant, and
$\Delta = \theta - \phi$. Let

$$\lim_{N \to \infty} \frac{n_1}{n_2} = \lim_{N \to \infty} \frac{n_1^*}{n_2^*} = r.$$

Suppose that $W$ and $W^*$ have the same size. The ARE of $W$ to
$W^*$ is defined to be $\lim_{N \to \infty} N^*/N$, where $N^*$ is the sample size of
the second test required to achieve the same power for a given
alternative $\Delta_N$ which $W_N$ achieves with respect to the same
alternative when using a sample of $N$ observations.

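Consider such a sequence of alternatives with $\Delta = \theta - \phi$ and
a sequence of statistics $W = \{W_N\}$ and $W^* = \{W^*_N\}$ satisfying the
following three conditions in the neighborhood of $\Delta = 0$
$(\phi = \theta = \phi_0)$:

(a)
$$\lim_{N \to \infty} P_{\theta\phi}\left\{ \frac{W_N - \mu_{\theta\phi}(W_N)}{\sigma_{\theta\phi}(W_N)} \le t \right\} = \int_{-\infty}^{t} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\,dx,$$

(b)
$$\lim_{N \to \infty} \frac{\sigma_{\theta\phi}(W_N)}{\sigma_0(W_N)} = 1,$$

(c)
$$E_W(F) = \lim_{N \to \infty} \left\{ \frac{\left[\frac{\partial}{\partial\Delta}\,\mu_{\theta\phi}(W_N)\right]_{\Delta=0}}{\sqrt{n_1 n_2 / N}\;\sigma_0(W_N)} \right\}^2$$

exists and is independent of $k$.

$E_W(F)$ is called the efficacy of $W$ at $F$, and $F$ refers to
the distribution under which all expectations above are taken.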

Pitman has shown that if (a), (b) and (c) are satisfied for
$W = \{W_N\}$ and $W^* = \{W^*_N\}$, then

$$\lim_{N \to \infty} \frac{N^*}{N} = E_{W,W^*}(F) = \frac{E_W(F)}{E_{W^*}(F)}$$

(where $E_{W,W^*}(F)$ is the ARE of $W$ to $W^*$) if $E_{W^*}(F) \ne 0$ and

$$\lim_{N \to \infty} \frac{n_1}{n_2} = \lim_{N \to \infty} \frac{n_1^*}{n_2^*} = r.$$

Consider now $T = \{T_N^p\}$. From the theorem we have that
condition (a) is satisfied. Capon (1961) has shown that for LMPRT's
condition (b) holds and the same argument shows that the same is
true for $T_N^p$.

To check condition (c) we calculate the relevant efficacies.
We evaluate

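$$\frac{n_1 N}{n_2}\,\sigma_0^2 = \int_0^1 J_p^2(x)\,dx - \left( \int_0^1 J_p(x)\,dx \right)^2.$$

The last integral gives

$$\int_0^1 J_p(x)\,dx = p \int_0^1 J_G(x)\,dx + (1-p) \int_0^1 J_{G^*}(x)\,dx.$$

Now

$$\int_0^1 J_G(x)\,dx = \int_0^1 J_G(G)\,dG = \int_{-\infty}^{\infty} \left[\frac{\partial}{\partial\theta} \log g_\theta(x)\right]_{\theta=\phi_0} dG_{\phi_0}(x) = 0.$$

This last result follows from an interchange of integration and
differentiation which may be carried out under the assumptions of
the theorem. Similarly we find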


$$\int_0^1 J_{G^*}(x)\,dx = 0.$$

Thus

$$\frac{n_1 N}{n_2}\,\sigma_0^2 = \int_0^1 J_p^2(x)\,dx = p^2 \int_0^1 J_G^2(x)\,dx + 2p(1-p) \int_0^1 J_G(x)J_{G^*}(x)\,dx + (1-p)^2 \int_0^1 J_{G^*}^2(x)\,dx.$$

Substituting $x = G_{\phi_0}(y)$ in the first integral and $x = G^*_{\phi_0}(y)$
in the last integral, we get finally

$$\frac{n_1 N}{n_2}\,\sigma_0^2 = p^2\,\mathrm{inf}\,G_{\phi_0} + 2p(1-p) \int_0^1 J_G(x)J_{G^*}(x)\,dx + (1-p)^2\,\mathrm{inf}\,G^*_{\phi_0},$$
1   [        °         o  °i 


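where $\mathrm{inf}\,G_{\phi_0}$ is R. A. Fisher's "information" of $G_\theta$ evaluated
at $\theta = \phi_0$. Notice that this is the variance of $T_N^p$ under both
$G_{\phi_0}$ and $G^*_{\phi_0}$. To calculate the efficacy we need the derivatives
of the mean,

$$\left.\frac{\partial}{\partial\Delta} E_{\theta\phi}(T_N^p \mid G^*)\right|_{\Delta=0} \quad \text{and} \quad \left.\frac{\partial}{\partial\Delta} E_{\theta\phi}(T_N^p \mid G)\right|_{\Delta=0}.$$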


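By the mean value theorem

$$H_{\theta\phi}(x) = G^*_\theta(x) + \frac{n_2}{N}\left(G^*_\phi(x) - G^*_\theta(x)\right) = G^*_\theta(x) - \frac{n_2}{N}\,\Delta \left.\frac{\partial G^*_\theta(x)}{\partial\theta}\right|_{\theta=\theta'}$$

for some $\theta'$, $\theta > \theta' > \phi$. Thus

$$E_{\theta\phi}(T_N^p \mid G^*) = \int_{-\infty}^{\infty} J_p\left[ G^*_\theta(x) - \frac{n_2\Delta}{N} \left.\frac{\partial G^*_\theta(x)}{\partial\theta}\right|_{\theta=\theta'} \right] dG^*_\theta(x).$$

Another application of the mean value theorem yields

$$E_{\theta\phi}(T_N^p \mid G^*) = \int_{-\infty}^{\infty} J_p(G^*_\theta(x))\,dG^*_\theta(x) - \frac{n_2\Delta}{N} \int_{-\infty}^{\infty} J_p'(\tilde G^*(x)) \left.\frac{\partial G^*_\theta(x)}{\partial\theta}\right|_{\theta=\theta'} dG^*_\theta(x),$$

where $\tilde G^*$ lies between $G^*_\theta$ and $G^*_\theta - \frac{n_2}{N}\,\Delta \left.\frac{\partial G^*_\theta}{\partial\theta}\right|_{\theta=\theta'}$.

Recalling that the first integral has been shown to be zero,
we have

$$\left.\frac{\partial}{\partial\Delta} E_{\theta\phi}(T_N^p \mid G^*)\right|_{\Delta=0} = -\frac{n_2}{N} \int_{-\infty}^{\infty} J_p'(G^*_{\phi_0}(x)) \left.\frac{\partial G^*_\theta(x)}{\partial\theta}\right|_{\theta=\phi_0} dG^*_{\phi_0}(x).$$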


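Integrating by parts we have the latter equal to

$$\frac{n_2}{N} \left\{ -\left[ \left.\frac{\partial G^*_\theta(x)}{\partial\theta}\right|_{\theta=\phi_0} J_p(G^*_{\phi_0}(x)) \right]_{-\infty}^{\infty} + \int_{-\infty}^{\infty} J_p(G^*_{\phi_0}(x))\; d\!\left( \left.\frac{\partial G^*_\theta(x)}{\partial\theta}\right|_{\theta=\phi_0} \right) \right\}$$

or

$$\left.\frac{\partial}{\partial\Delta} E_{\theta\phi}(T_N^p \mid G^*)\right|_{\Delta=0} = \frac{n_2}{N} \left\{ \int_{-\infty}^{\infty} J_{G^*}(G^*_{\phi_0}(x))\,J_p(G^*_{\phi_0}(x))\,dG^*_{\phi_0}(x) - \left[ \left.\frac{\partial G^*_\theta(x)}{\partial\theta}\right|_{\theta=\phi_0} J_p(G^*_{\phi_0}(x)) \right]_{-\infty}^{\infty} \right\}.$$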


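We conclude

$$E_{T^p}(G^*) = \frac{\left\{ p \int_0^1 J_G(x)J_{G^*}(x)\,dx + (1-p)\,\mathrm{inf}\,G^*_{\phi_0} - \left[ \left.\frac{\partial G^*_\theta(x)}{\partial\theta}\right|_{\theta=\phi_0} J_p(G^*_{\phi_0}(x)) \right]_{-\infty}^{\infty} \right\}^2}{p^2\,\mathrm{inf}\,G_{\phi_0} + 2p(1-p) \int_0^1 J_G(x)J_{G^*}(x)\,dx + (1-p)^2\,\mathrm{inf}\,G^*_{\phi_0}}.$$

Similarly

$$E_{T^p}(G) = \frac{\left\{ p\,\mathrm{inf}\,G_{\phi_0} + (1-p) \int_{-\infty}^{\infty} J_{G^*}(G_{\phi_0}(x))\,J_G(G_{\phi_0}(x))\,dG_{\phi_0}(x) - \left[ \left.\frac{\partial G_\theta(x)}{\partial\theta}\right|_{\theta=\phi_0} J_p(G_{\phi_0}(x)) \right]_{-\infty}^{\infty} \right\}^2}{p^2\,\mathrm{inf}\,G_{\phi_0} + 2p(1-p) \int_0^1 J_G(x)J_{G^*}(x)\,dx + (1-p)^2\,\mathrm{inf}\,G^*_{\phi_0}}.$$

Thus the conditions (a), (b) and (c) are satisfied and we can
calculate ARE's by taking ratios of efficacies. These calculations
allow us to prove an interesting lemma which shows surprising
symmetry relations among these tests.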

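LEMMA: If

$$\left[ \frac{\partial}{\partial\theta} G_\theta(x)\, \frac{\partial}{\partial\theta} \log g^*_\theta(x) \right]_{\theta=\phi_0} \Bigg|_{x=-\infty}^{\infty} = \left[ \frac{\partial}{\partial\theta} G^*_\theta(x)\, \frac{\partial}{\partial\theta} \log g_\theta(x) \right]_{\theta=\phi_0} \Bigg|_{x=-\infty}^{\infty}$$

then

$$E_{T^1,T^0}(G^*_{\phi_0}) = E_{T^0,T^1}(G_{\phi_0}).$$

Further if $\mathrm{inf}\,G_{\phi_0} = \mathrm{inf}\,G^*_{\phi_0}$ then

$$E_{T^p}(G^*_{\phi_0}) = E_{T^{1-p}}(G_{\phi_0}).$$

Proof: This result is immediate when one notices that

$$J_{G^*}(G_{\phi_0}(x)) \Big|_{-\infty}^{\infty} = J_{G^*}(1) - J_{G^*}(0) = J_{G^*}(G^*_{\phi_0}(x)) \Big|_{-\infty}^{\infty}$$

and

$$J_G(G^*_{\phi_0}(x)) \Big|_{-\infty}^{\infty} = J_G(1) - J_G(0) = J_G(G_{\phi_0}(x)) \Big|_{-\infty}^{\infty}.$$

Q. E. D.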


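Example 1. Suppose $\Lambda$ consists of two elements, $\lambda = 1$ denoting
normal c.d.f.'s with unit variance and $\lambda = 2$ denoting logistic
distributions

$$G_\theta(x) = \frac{1}{1 + e^{-(x-\theta)}}.$$

Then the LMPRT's for the unknown location parameter are the
Fisher-Yates, $C_N$, and the Wilcoxon, $W_N$, respectively. We
easily calculate:

$$\frac{\partial}{\partial\Delta} E(W_N \mid \text{Logistic}) = \frac13\,\frac{n_2}{N}, \qquad \frac{\partial}{\partial\Delta} E(W_N \mid \text{Normal}) = \frac{1}{\sqrt{\pi}}\,\frac{n_2}{N}, \qquad \sigma^2_{W_N} = \frac13\,\frac{n_2}{n_1 N},$$

$$\frac{\partial}{\partial\Delta} E(C_N \mid \text{Logistic}) = \frac{1}{\sqrt{\pi}}\,\frac{n_2}{N}, \qquad \frac{\partial}{\partial\Delta} E(C_N \mid \text{Normal}) = \frac{n_2}{N}, \qquad \sigma^2_{C_N} = \frac{n_2}{n_1 N}.$$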
Letting $\Phi(x)$ represent the standard normal c.d.f. we have

$$\int_0^1 J_W(x)J_C(x)\,dx = \int_0^1 (2x-1)\,\Phi^{-1}(x)\,dx = \int_{-\infty}^{\infty} (2\Phi(y)-1)\,y\,d\Phi(y) = \int_{-\infty}^{\infty} (2\Phi(y)-1)\,y\,\Phi'(y)\,dy.$$

Integrating by parts and noticing that

$$-y\,\Phi'(y) = \Phi''(y)$$

we have

$$\int_0^1 J_W(x)J_C(x)\,dx = 2 \int_{-\infty}^{\infty} (\Phi'(y))^2\,dy = \frac{2}{2\pi} \int_{-\infty}^{\infty} e^{-y^2}\,dy = \frac{1}{\sqrt{\pi}}.$$
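
The value $1/\sqrt{\pi}$ of this cross integral can be checked numerically. The sketch below uses a plain midpoint rule on $(0,1)$; the helper name `cross_integral` and the grid size are illustrative choices, not part of the original derivation.

```python
import math
from statistics import NormalDist


def cross_integral(score_a, score_b, n=20000):
    """Midpoint-rule approximation of the integral over (0,1) of a(u)*b(u)."""
    h = 1.0 / n
    return h * sum(score_a((k + 0.5) * h) * score_b((k + 0.5) * h)
                   for k in range(n))


def wilcoxon_score(u):
    """Logistic score function J(u) = 2u - 1."""
    return 2.0 * u - 1.0


# Normal score function: the inverse standard normal c.d.f.
normal_score = NormalDist().inv_cdf

value = cross_integral(wilcoxon_score, normal_score)
target = 1.0 / math.sqrt(math.pi)
```

The midpoint rule never evaluates the integrand at 0 or 1, so the unbounded normal score at the endpoints causes no difficulty.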


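Letting $T_N^p = pW_N + (1-p)C_N$ we find

$$E_{T^p}(\text{Normal}) = \frac{\left[\frac{p}{\sqrt{\pi}} + (1-p)\right]^2}{\frac{p^2}{3} + (1-p)^2 + \frac{2p(1-p)}{\sqrt{\pi}}}, \qquad E_{T^p}(\text{Logistic}) = \frac{\left[\frac{p}{3} + \frac{1-p}{\sqrt{\pi}}\right]^2}{\frac{p^2}{3} + (1-p)^2 + \frac{2p(1-p)}{\sqrt{\pi}}}.$$

We may modify these distributions by a change of scale of
one of them to give all cdf's in $\Lambda$ the same information: Let
the variance $\sigma^2$ of the normal random variable be equal to 3.
Then

$$E_{T^p,C_N}(\text{Normal}) = \frac{\left[p\sqrt{3/\pi} + (1-p)\right]^2}{p^2 + (1-p)^2 + 2p(1-p)\sqrt{3/\pi}},$$

$$E_{T^p,W_N}(\text{Logistic}) = \frac{\left[p + (1-p)\sqrt{3/\pi}\right]^2}{p^2 + (1-p)^2 + 2p(1-p)\sqrt{3/\pi}} = E_{T^{1-p},C_N}(\text{Normal}).$$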


This example, together with Figure 4, illustrates the symmetry
uncovered in the lemma.
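
The symmetry can also be seen numerically. The sketch below evaluates the two asymptotic relative efficiencies of $T_N^p = pW_N + (1-p)C_N$ after both shapes are scaled to the same information, using the cross term $\sqrt{3/\pi}$; the function name `are_two_shapes` is hypothetical, and the formula assumes the boundary terms in the efficacy vanish, as they do in this example.

```python
import math


def are_two_shapes(p, cross):
    """AREs of T^p = p*W + (1-p)*C relative to the LMPRT of each shape.

    `cross` is the normalized cross integral of the two score functions.
    Returns (ARE under the logistic shape, ARE under the normal shape).
    """
    denom = p * p + (1.0 - p) ** 2 + 2.0 * p * (1.0 - p) * cross
    are_normal = (p * cross + (1.0 - p)) ** 2 / denom
    are_logistic = (p + (1.0 - p) * cross) ** 2 / denom
    return are_logistic, are_normal


CROSS = math.sqrt(3.0 / math.pi)

# Risk points traced out as p runs from 0 (pure C_N) to 1 (pure W_N).
risk_points = [are_two_shapes(k / 10.0, CROSS) for k in range(11)]
```

Swapping $p$ and $1-p$ interchanges the two coordinates, so the list `risk_points` read backwards is the same curve reflected in the diagonal.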

Figure 4. Asymptotic risk points of $T_N^p = pW_N + (1-p)C_N$.
[Plot: horizontal axis $E_{T^p}(\text{Logistic})$, vertical axis $E_{T^p}(\text{Normal})$;
the curve joins the points $(3/\pi, 1)$ and $(1, 3/\pi) = (1, .95)$.]

Example 2. Suppose $\Lambda$ again consists of two elements, $\lambda = 1$:
normal cdf's with unit variance, and $\lambda = 2$: double exponential
cdf's:

$$L(x,\theta) = \begin{cases} \frac12 e^{(x-\theta)} & x \le \theta, \\ 1 - \frac12 e^{-(x-\theta)} & x > \theta, \end{cases}$$

with density function

$$\ell(x,\theta) = \tfrac12 e^{-|x-\theta|}, \qquad -\infty < x < \infty.$$

Notice that both distributions have unit information with regard
to the unknown location parameter. The LMPRT's in this case are
the Fisher-Yates, $C_N$, and the test given by Birnbaum (1962) and
investigated by Laska (1962a). We easily calculate:

$$\frac{\partial}{\partial\Delta} E(B_N \mid \text{double exponential}) = \frac{n_2}{N}, \qquad \frac{\partial}{\partial\Delta} E(B_N \mid \text{Normal}) = \sqrt{\frac{2}{\pi}}\,\frac{n_2}{N}, \qquad \sigma^2_{B_N} = \frac{n_2}{n_1 N},$$

$$\frac{\partial}{\partial\Delta} E(C_N \mid \text{double exponential}) = \sqrt{\frac{2}{\pi}}\,\frac{n_2}{N}, \qquad \frac{\partial}{\partial\Delta} E(C_N \mid \text{Normal}) = \frac{n_2}{N}, \qquad \sigma^2_{C_N} = \frac{n_2}{n_1 N}.$$

Also

$$\int_0^1 J_B(x)J_C(x)\,dx = -\int_{-\infty}^{0} \Phi^{-1}(\Phi(x))\,d\Phi(x) + \int_0^{\infty} \Phi^{-1}(\Phi(x))\,d\Phi(x) = \sqrt{\frac{2}{\pi}}.$$

Letting $T_N^p = pB_N + (1-p)C_N$ we have

$$E_{T^p,C_N}(\Phi) = \frac{\left[p\sqrt{2/\pi} + (1-p)\right]^2}{p^2 + (1-p)^2 + 2\sqrt{2/\pi}\,p(1-p)},$$

$$E_{T^p,B_N}(L) = \frac{\left[p + (1-p)\sqrt{2/\pi}\right]^2}{p^2 + (1-p)^2 + 2\sqrt{2/\pi}\,p(1-p)} = E_{T^{1-p},C_N}(\Phi).$$

These calculations, together with Figure 5, provide a second
illustration of the symmetry uncovered in the lemma.

Example 3. Let $\Lambda$ contain the three shapes represented in
Examples 1, 2. Then the asymptotic risk points of the ARTs
constitute a concave surface in the unit cube; Figures 4 and 5 represent
edges of this surface lying in two faces of this cube. Some
relevant statements about this surface follow immediately, and others
can be found by determining a few points in the third edge
analogous to these, and especially by computing several risk points in
the interior of this surface such as that of
$T_N = (1/3)C_N + (1/3)W_N + (1/3)B_N$.

An interesting open problem is whether an ART is always also
a LMPRT, under the assumption that $\Lambda$ is a convex family of
shapes in the sense defined in an analogous comment in Section 2
above.
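
An interior risk point of the kind mentioned in Example 3 can be sketched as follows. Several things here are assumptions rather than results stated in the text: the three shapes are taken to be scaled to unit information, the boundary terms in the efficacies are taken to vanish, the two-shape efficacy formula is extended to three shapes in the obvious way, and the logistic and double exponential cross term $\sqrt{3}/2$ is our own evaluation of $\int_0^1 \sqrt{3}\,(2u-1)\,\mathrm{sign}(u-\tfrac12)\,du$. The name `risk_point` is hypothetical.

```python
import math

# Order of shapes: normal (C_N), logistic (W_N), double exponential (B_N).
C_NL = math.sqrt(3.0 / math.pi)   # normal vs logistic, from Example 1
C_ND = math.sqrt(2.0 / math.pi)   # normal vs double exponential, Example 2
C_LD = math.sqrt(3.0) / 2.0       # logistic vs double exponential (assumed)

CROSS = [
    [1.0, C_NL, C_ND],
    [C_NL, 1.0, C_LD],
    [C_ND, C_LD, 1.0],
]


def risk_point(weights, cross):
    """ARE of the weighted-score test under each shape.

    Coordinate s is (sum_j w_j * cross[j][s])**2 / (w' * cross * w).
    """
    k = len(weights)
    denom = sum(weights[i] * weights[j] * cross[i][j]
                for i in range(k) for j in range(k))
    return [sum(weights[j] * cross[j][s] for j in range(k)) ** 2 / denom
            for s in range(k)]


interior = risk_point([1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0], CROSS)
vertex_normal = risk_point([1.0, 0.0, 0.0], CROSS)
```

Setting one weight to zero recovers the two-shape formulas of Examples 1 and 2, which gives a quick consistency check on the matrix entries.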

Figure 5. Asymptotic risk points of $T_N^p = pB_N + (1-p)C_N$.

3.4 Efficiency-robust estimators. The general method of
constructing confidence limit estimators and median-unbiased point
estimators from given testing methods is well known, and has been
discussed recently by Hodges and Lehmann (1963) for the class of
two-sample rank tests on which they base an approach to
efficiency-robust estimation. The comments above on the practical and
theoretical value of extending theory and techniques for LMPRTs to
theory and techniques for ARTs have direct analogues in connection
with these problems of estimation. The actual construction of
estimators from ARTs, and the interpretation of their efficiency
properties, do not require special discussion since these consist
of reinterpretation of test-efficiency properties in the context
of corresponding estimation problems as in the reference above.
It should not be overlooked that despite validity-robustness and
efficiency-robustness properties which may be established for
estimators based on LMPRTs or ARTs, there remains the important
question whether such estimators or others have in addition
robustness with respect to the strong assumption of identical form (apart
from shift) of the two distributions. Conceivably the latter
robustness problem could be approached by an extension of the above
methods. Also to be considered in connection with the latter
robustness question is the possibility of estimating shift by the
difference of two efficiency-robust estimators of one of the kinds
considered in Section 2 above.

4. Efficiency-robust unbiased estimators.
4.1 Introduction. Consider any family of density functions
$f(x,\theta,\lambda)$, $\theta \in \Theta$, a subset of the real line, $\lambda \in \Lambda$. The problem
of unbiased estimation of $\theta$ is that of finding estimators $t(x)$,
satisfying $E(t(X) \mid \theta,\lambda) = \theta$ for each $\theta, \lambda$, for which $\mathrm{var}(t \mid \theta,\lambda)$
takes values which are jointly suitably small in some specified
sense for $\theta \in \Theta$, $\lambda \in \Lambda$.

The case in which $\lambda$ is known (or in which no nuisance
parameter $\lambda$ appears), and in which the criterion of uniformly
minimum variance (UMV) is adopted, is the only one which has been
investigated very generally and systematically. In this case four
problems have been of principal interest:

1. Under what conditions on $f(x,\theta)$, $\theta$ admitting unbiased
estimation for $\theta \in \Theta$, do UMV estimators exist? It has been shown
(Rao (1945), Blackwell (1947)) that in case there exists a
complete sufficient statistic, $t$, then the conditional expectation
of $s$ given $t$, where $s$ is any unbiased estimator such that
the variance of $s$ is finite for each $\theta$, will be UMV. In
general, when no such statistic exists, there may or may not be a UMV
estimator and no general theory now covers this case.

2.  Under  what  further  conditions  are  such  UMV  estimators  essen- 
tially unique?  It  has  been  shown  that  existence  of  a  complete  suf- 
ficient statistic  guarantees  essential  uniqueness.   If  none  exists 
no  general  theory  is  available. 

3. Under what conditions is a given estimator UMV? A sufficient
condition is that it be unbiased and have variance equal
identically to the Cramer-Rao lower bound; a strictly weaker sufficient
condition is that the estimator be a function of a complete sufficient
statistic. But the latter condition is not necessary, and when it
fails the problem is sometimes difficult and is not covered by
available theory.

4.   Can  a  randomized  unbiased  estimator  be  UMV?  A  negative  answer 
follows  immediately  from  the  Rao-Blackwell  Theorem. 
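
The Rao-Blackwell step behind this negative answer can be illustrated on a toy case. The sketch below is only an illustration under assumed inputs: a binomial observation with two trials and a hypothetical randomized unbiased estimator that perturbs $x/2$ by $\pm 0.1$ at $x = 1$. Replacing the randomized rule by its conditional mean keeps it unbiased and removes the extra variance.

```python
from math import comb

N_TRIALS = 2


def pmf(x, theta):
    """Binomial(N_TRIALS, theta) probability of observing x."""
    return comb(N_TRIALS, x) * theta ** x * (1.0 - theta) ** (N_TRIALS - x)


def randomized_rule(x):
    """Hypothetical randomized estimator as (value, probability) pairs."""
    center = x / N_TRIALS
    if x in (0, N_TRIALS):
        return [(center, 1.0)]          # no randomization at the ends
    return [(center - 0.1, 0.5), (center + 0.1, 0.5)]


def derandomized_rule(x):
    """Conditional mean of the randomized rule given x."""
    return [(sum(v * q for v, q in randomized_rule(x)), 1.0)]


def mean_and_variance(rule, theta):
    """Exact mean and variance of an estimator given as a rule x -> pairs."""
    xs = range(N_TRIALS + 1)
    mean = sum(pmf(x, theta) * q * v for x in xs for v, q in rule(x))
    second = sum(pmf(x, theta) * q * v * v for x in xs for v, q in rule(x))
    return mean, second - mean * mean


theta0 = 0.3
mean_r, var_r = mean_and_variance(randomized_rule, theta0)
mean_d, var_d = mean_and_variance(derandomized_rule, theta0)
```

Here the variance gap `var_r - var_d` equals the expected conditional variance of the randomization, which is strictly positive whenever the randomized rule is not degenerate.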

For the wide class of problems in which no UMV estimator
exists (for an estimable $\theta$), the general goal of small values of
$\mathrm{var}(t \mid \theta)$ may be formulated as high precision at or near some
specified value $\theta_0$ of special interest; more precisely, a
locally-best (LB) estimator has been defined as an unbiased estimator which
minimizes $\mathrm{var}(t \mid \theta_0)$, where $\theta_0$ is a specified value of $\theta$.
Stein (1950) investigated the four problems above as they apply to
LB rather than UMV estimators, giving conditions for existence and
uniqueness, and a functional equation characterizing LB estimators;
that randomized estimators cannot be LB is seen in the same way as
with the UMV property above.

The third and most general sense of smallness of values of
$\mathrm{var}(t \mid \theta)$ has not previously been systematically studied. It is
that of admissibility of an unbiased estimator $t$ defined with
reference to the variance function $\mathrm{var}(t \mid \theta)$, $\theta \in \Theta$. Therefore
we present here generalizations, using this criterion, of Stein's
results. Among unbiased estimators, conditions are given for
existence and uniqueness of Bayes solutions (with risk defined as
variance); the uniqueness conditions automatically entail
admissibility. A theorem on completeness of the class of Bayes solutions
is given. These generalizations are presented along with their
further generalization to the case of principal interest in the
present paper, that in which an additional nuisance (or "shape")
parameter $\lambda$ is present: We deal then with densities $f(x,\theta,\lambda)$;
unbiased estimators $t(x)$ satisfying $E(t \mid \theta,\lambda) = \theta$ for all
$\theta \in \Theta$ and $\lambda \in \Lambda$; and admissibility of such estimators defined
in the usual way with reference to their variance functions
$\mathrm{var}(t \mid \theta,\lambda)$.

No examples of applications of these generalizations are
given; even for the special case of locally-best estimators, the
present writers are not aware of any non-trivial applications to
examples.

4.2 Uniqueness of Bayes solutions. In this section we are
concerned with admissible unbiased estimators of a real valued
parameter $\theta$ of the density function $f(x,\theta,\lambda)$, $\theta \in \Theta$, $\lambda \in \Lambda$,
$\Theta \times \Lambda = \Omega$, where $x$ is a point in the $\sigma$-finite measure space
$(R, \mathcal{V}, \mu)$. Let $U$ be the convex set (i.e. $a\delta_1 + (1-a)\delta_2 \in U$ if
$\delta_1, \delta_2 \in U$, for $0 < a < 1$) of unbiased estimators of $\theta$. We
assume at the outset that $U$ is not empty. We shall be interested
in those $\delta^*(x) \in U$ for which

$$r^*(\delta^*,J) = \int_\Omega \int_R (\delta^*(x)-\theta)^2 f(x,\theta,\lambda)\,d\mu(x)\,dJ(\theta,\lambda)$$
$$= \inf_{\delta \in U} \int_\Omega \int_R (\delta(x)-\theta)^2 f(x,\theta,\lambda)\,d\mu(x)\,dJ(\theta,\lambda) = \inf_{\delta \in U} \int_\Omega \sigma^2_{\theta,\lambda}(\delta)\,dJ(\theta,\lambda),$$

where $\sigma^2_{\theta,\lambda}(\delta) = E_{\theta,\lambda}[\delta(X)-\theta]^2$ and $J(\theta,\lambda)$ is any a priori
measure over $\Omega$. $\delta^*(x)$ is a Bayes solution with respect to $J$. The
reason for this interest follows from the fact that if $\delta^*(x)$ is
the unique estimator minimizing $\int_\Omega \sigma^2_{\theta,\lambda}(\delta)\,dJ(\theta,\lambda)$ for any
distribution function $J(\theta,\lambda)$ over $\Omega$, then $\delta^*(x)$ is admissible.

Define $\int_\Omega f(x,\theta,\lambda)\,dJ(\theta,\lambda) = f_J(x)$. We now give a condition for
uniqueness of Bayes solutions:

LEMMA 4.1. If

$$\int_R \frac{[f(x,\theta,\lambda)]^2}{f_J(x)}\,d\mu(x) < \infty$$

for all $(\theta,\lambda) \in \Omega$ and if for some $\delta \in U$ the Bayes risk
$r^*(\delta,J)$ is finite, then there exists $\delta^*(x)$ such that

$$\int_\Omega \sigma^2_{\theta,\lambda}(\delta^*)\,dJ(\theta,\lambda) = \inf_{\delta \in U} \int_\Omega \sigma^2_{\theta,\lambda}(\delta)\,dJ(\theta,\lambda).$$

Moreover $\delta^*$ is essentially unique, i.e., up to an almost
everywhere ($\mu$) equivalence.

Proof: For any $\delta \in U$ the risk is given by

$$r^*(\delta,J) = \int_\Omega \int_R (\delta(x)-\theta)^2 f(x,\theta,\lambda)\,d\mu(x)\,dJ(\theta,\lambda)$$
$$= \int_\Omega \int_R \delta^2(x) f(x,\theta,\lambda)\,d\mu(x)\,dJ(\theta,\lambda) - \int_\Omega \theta^2\,dJ(\theta,\lambda) < \infty.$$

The problem is thus equivalent to minimizing

$$\int_\Omega \int_R \delta^2(x) f(x,\theta,\lambda)\,d\mu(x)\,dJ(\theta,\lambda) = \int_R \delta^2(x) f_J(x)\,d\mu(x)$$

subject to

$$\int_R \delta(x) f(x,\theta,\lambda)\,d\mu(x) = \theta$$

for all $(\theta,\lambda) \in \Omega$.

The remainder of the proof is identical with that of Stein
(1950) who considered the case corresponding to an a priori
distribution over $\Omega$ with all of its mass at one point $(\theta_0,\lambda_0)$. Stein's
[...] description of admissible unbiased estimators. We shall illustrate
by stating a generalization of his "Principal Theorem".

THEOREM 4.1 (Generalization of Stein's "Principal Theorem"):
Suppose that for an a priori distribution $J(\theta,\lambda)$ over $\Omega$,
$\int_R \frac{[f(x,\theta,\lambda)]^2}{f_J(x)}\,d\mu(x)$ exists and $\frac{f(x,\theta,\lambda)}{f_J(x)}$ is finite for all $(\theta,\lambda) \in \Omega$
and almost all $x$, and that there exists an unbiased estimator $\delta$
of $\theta$ for which $\int_\Omega \sigma^2_{\theta,\lambda}(\delta)\,dJ(\theta,\lambda) < \infty$. Then $\delta^*$ is the Bayes
solution (essentially unique by virtue of Lemma 4.1) with respect
to $J$ if and only if there is a real-valued functional $T$ over
the set $G$ of functions of $(\theta,\lambda)$ of the form

$$\int_R \phi(x) f(x,\theta,\lambda)\,d\mu(x),$$

where $\phi(x)$ satisfies

$$\int_R \phi^2(x) f_J(x)\,d\mu(x) < \infty,$$

such that

$$T\left( \int_R \frac{f(x,\theta_1,\lambda_1)}{f_J(x)}\, f(x,\theta,\lambda)\,d\mu(x) \right) = \theta_1$$

for all $(\theta_1,\lambda_1) \in \Omega$ and

$$T\left( \int_R \phi(x) f(x,\theta,\lambda)\,d\mu(x) \right) = \int_R \phi(x)\,\delta^*(x)\,f_J(x)\,d\mu(x).$$

Stein's corollaries (pp. 409, 410) can also be generalized in
a natural manner to provide explicit methods of determination of
$T$ and therefore of the Bayes solutions which, at the very least,
form an important subset of the admissible class.

4.3 A complete class theorem. The main content of this section
is Theorem 4.2 which asserts that under certain conditions the
class of Bayes solutions is essentially complete. We begin with a
few technical lemmas.

LEMMA 4.2. Let $(R, \mathcal{V}, \mu)$ be a $\sigma$-finite measure space and
$(\delta_n;\ n = 0,1,2,\ldots)$ be a sequence of $\mathcal{V}$-measurable functions such
that

$$\int_B \delta_n\,d\mu \to \int_B \delta_0\,d\mu$$

for all $B \in \mathcal{V}$ with $\mu(B) < \infty$. Suppose $|\delta_n(x)| \le C$ for all
$n = 0,1,2,\ldots$, $x \in R$, where $C$ is a fixed but arbitrary constant.
Then

$$\int_R \delta_n g\,d\mu \to \int_R \delta_0 g\,d\mu$$

for all $g \in L_1(\mu)$, the space of absolutely integrable functions
with respect to the measure $\mu$.

For proof see Dunford and Schwartz (1961, Problem 6, p. 339).

LEMMA 4.3. Let $\Theta$ be a compact metric space and $(\pi_n;\ n = 0,1,2,\ldots)$
be a sequence of probability measures on $\Theta$. In order that

$$\int_\Theta g(\theta)\,d\pi_n(\theta) \to \int_\Theta g(\theta)\,d\pi_0(\theta)$$

for all continuous $g$ on $\Theta$ it is sufficient that for some covering
net $[D_{k_1 \ldots k_m}]$

$$\pi_n(D_{k_1 \ldots k_m}) \to \pi_0(D_{k_1 \ldots k_m}) \quad \text{for every } D_{k_1 \ldots k_m} \text{ in the net.}$$

(For the definition of a covering net see Wald (1950, p. 66).)

LEMMA 4.4. Let $\delta(A,x)$ be a probability measure in $A$ for each
$x$ and measurable for each $A$ ($A \in \mathcal{B}$, $x \in R$). Let $g(\theta')$ be a
bounded measurable function of $\theta' \in \Theta$. Define (for $\gamma$ any
probability measure on $R$)

$$\nu(A) = \int_R \delta(A,x)\,d\gamma(x).$$

Then $\nu$ is a probability measure on $\Theta$ and

$$\int_\Theta g(\theta')\,d\nu(\theta') = \int_R \left( \int_\Theta g(\theta')\,d\delta(\theta',x) \right) d\gamma(x).$$

Note: $\Theta$ being a compact metric space implies that the Borel sets
form a $\sigma$-algebra.

For proof see Loeve (1955, Problem 13, p. 140).


THEOREM 4.2. Let $\Theta$ be a compact subset of the real line, and
$f(x,\theta)$ continuous in $\theta$. Then the class of Bayes solutions with
respect to a priori distributions $G(\theta)$ over $\Theta$ is essentially
complete where the space of decision functions $U_r$ is the class
of unbiased, randomized estimators, $\delta(x,A)$. (That is, $\delta(x,A)$ is
a probability measure over $\Theta$ for each $x$ and measurable for
each measurable set $A$, $A \subset \Theta$, $x \in R$.) Further, if $\Theta$ is
convex then any purely randomized decision function is inadmissible.

Proof: It is sufficient to show that the assumptions of Wald
(1950, 3.1 to 3.7) are satisfied. Except for the requirement that
the class of decision functions be closed in the sense of "regular
convergence" (Wald (1950, Assumption 3.6 (ii), p. 68)), the
verification of the other requirements is analogous to that in the
well-known case of mean squared error loss function without restriction
to unbiasedness. Let $\delta(x)$ be a member of the class $U_r$ of
randomized, unbiased estimators, i.e., for each $x$, $\delta(x)$ is a
probability measure over $\Theta$. Let $D$ be any measurable subset of
$\Theta$. For fixed $x$ and a given decision function $\delta$, let us denote
the probability that the decision lies in $D$ by
$\delta(x,D) = \int_D d\delta(x,\theta)$. To say $\delta(x)$ is unbiased is to require that

$$\int_R \left[ \int_\Theta \theta'\,d\delta(x,\theta') \right] f(x,\theta)\,d\mu(x) = \theta \quad \text{for all } \theta \in \Theta.$$

We now show that the class of randomized estimators, $U_r$, is
closed in the sense of regular convergence. Let $[D_{k_1 \ldots k_m}]$ be
a covering net of $\Theta$ and suppose that

$$\lim_{n \to \infty} \int_{R_0} \delta_n(x, D_{k_1 \ldots k_m})\,d\mu(x) = \int_{R_0} \delta(x, D_{k_1 \ldots k_m})\,d\mu(x)$$

for every $D_{k_1 \ldots k_m}$ in the net and every set $R_0 \in \mathcal{V}$ of finite $\mu$
measure, where $[\delta_n(x)]$ is a sequence of randomized unbiased
estimators. Then we must show that $\delta(x)$ is a member of $U_r$.


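From Lemma 4.2, with $C = 1$, we get

$$\int_R \delta_n(A,x)\,g(x)\,d\mu(x) \to \int_R \delta_0(A,x)\,g(x)\,d\mu(x)$$

for all $g \in L_1(\mu)$ and $A \in \mathcal{B}$.

In particular taking $g(x) = f(x,\theta)$ we have

$$\int_R \delta_n(A,x)\,f(x,\theta)\,d\mu(x) \to \int_R \delta_0(A,x)\,f(x,\theta)\,d\mu(x)$$

for all $\theta$ and all $A \in \mathcal{B}$. Write

$$dP_\theta = f(x,\theta)\,d\mu$$

and

$$\nu_n^\theta(A) = \int_R \delta_n(A,x)\,dP_\theta(x).$$

Rewriting the above we have for each arbitrary but fixed $\theta$

$$\nu_n^\theta(A) \to \nu_0^\theta(A) \quad \text{for all } A \in \mathcal{B}.$$

Hence, by Lemma 4.3, for any continuous function $g(\theta')$ over $\Theta$,

$$\int_\Theta g(\theta')\,d\nu_n^\theta(\theta') \to \int_\Theta g(\theta')\,d\nu_0^\theta(\theta')$$

(where in Lemma 4.3 we have let $\pi_n = \nu_n^\theta$).

Now $\nu_n^\theta(A) = \int_R \delta_n(A,x)\,dP_\theta(x)$. Letting $\gamma = P_\theta$ in Lemma 4.4
yields

$$\int_\Theta g(\theta')\,d\nu_n^\theta(\theta') = \int_R \left[ \int_\Theta g(\theta')\,d\delta_n(\theta',x) \right] dP_\theta(x).$$

Therefore

$$\int_R \int_\Theta g(\theta')\,d\delta_n(\theta',x)\,dP_\theta(x) \to \int_R \int_\Theta g(\theta')\,d\delta_0(\theta',x)\,dP_\theta(x).$$

Writing $g(\theta') = \theta'$ for all $\theta' \in \Theta$ gives

$$\int_R \int_\Theta \theta'\,d\delta(x,\theta')\,f(x,\theta)\,d\mu(x) = \theta \quad \text{for all } \theta \in \Theta$$

and $\delta(x)$ is an unbiased randomized estimator. Thus $U_r$ is
closed in the sense of regular convergence and the assumptions of
Wald are satisfied. Therefore essential completeness is proved.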

Frequently it will not be necessary to use a randomized
estimator, for one may apply the Rao-Blackwell theorem (Rao (1945),
Blackwell (1947)). That is, for any fixed $x$, $\delta(x)$ defines a
probability measure over $\Theta$. Consider for use as an estimator
the conditional expected value of $\theta$, under this distribution,
given $x$. If $\Theta$ is convex this new non-randomized estimator is
unbiased, and it has variance less than or equal to that of its
generator, the randomized estimator, which is therefore inadmissible.
The proof is now complete.

An extension of this theorem to the case pertinent to
robustness is immediate. Introduction of the "nuisance" parameter $\lambda$
disturbs only the property that Wald's assumption 3.7 (Wald (1950))
is satisfied. Let

$$V = \{ f(x,\theta,\lambda) \mid \lambda \in \Lambda,\ \theta \in \Theta \}$$

and let $f_n \in V$, $n = 1,2,3,\ldots$, and

$$\tau_r(f_n,f) = \sup_{M \subset R} \left| \int_M [f_n(x,\theta,\lambda) - f(x,\theta,\lambda)]\,d\mu(x) \right|$$

where $M$ is any $\mu$-measurable subset of $R$. Assumption 3.7
concerns itself with compactness of $V$ in the sense of the metric
$\tau_r$ and the property that if $f_n \to f$ in the sense of $\tau_r$, then

$$\sup_{\hat\theta} \left| W(f_n,\hat\theta) - W(f,\hat\theta) \right| \to 0$$

where $W$ is the loss function $(\hat\theta - \theta)^2$. Since assumptions 3.1 to
3.6 are not affected by $\lambda$ we have a


COROLLARY. If $f(x,\theta,\lambda)$ is a family of density functions, $\lambda \in \Lambda$,
where $\Lambda$ contains finitely many points such that for each fixed
$\lambda$ the assumptions of Theorem 4.2 are satisfied, then the class of
Bayes solutions with respect to a priori distributions over
$\Theta \times \Lambda$ is essentially complete with respect to the class of
randomized unbiased estimators.

The proof of the corollary follows immediately from the fact
that the assumptions 3.1-3.7 are satisfied in the same way as in
the case when $\Lambda$ contained only one point. We summarize with
the following corollary.

COROLLARY. If for each $J$ the conditions of Lemma 4.1 and the
conditions of Theorem 4.2 are satisfied and if $\Lambda$ contains
finitely many points then the class of Bayes solutions forms an
essentially complete class of admissible estimators.